KNN-kernel density-based clustering for high-dimensional multivariate data
Authors
Abstract
Density-based clustering algorithms for multivariate data often have difficulties with high-dimensional data and with clusters of very different densities. This paper presents a new density-based clustering algorithm, KNNCLUST, that is able to tackle these situations. It is based on a combination of nonparametric k-nearest-neighbor (KNN) and kernel density estimation (KNN-kernel). The KNN-kernel density estimation technique makes it possible to model clusters of different densities in high-dimensional data sets. Moreover, the algorithm identifies the number of clusters automatically. KNNCLUST is tested using simulated data and applied to a multispectral compact airborne spectrographic imager (CASI) image of a floodplain in the Netherlands to illustrate the characteristics of the method. © 2005 Elsevier B.V. All rights reserved.
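The abstract does not give the estimator's formula, so the following is only a minimal sketch of one common way to combine the two ideas: a kernel density estimate whose bandwidth at each point is the distance to that point's k-th nearest neighbor. The Gaussian kernel, the function name `knn_kernel_density`, and the brute-force distance computation are assumptions made for illustration; this is not the KNNCLUST algorithm itself, and it does not perform any clustering.

```python
import numpy as np


def knn_kernel_density(X, k):
    """Illustrative KNN-kernel density estimate (assumed form, not from the paper).

    The bandwidth at each point is the distance to its k-th nearest
    neighbor, so dense regions get narrow kernels and sparse regions
    get wide ones. Requires 1 <= k < n and no bandwidth equal to zero
    (i.e. fewer than k exact duplicates of any point).
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape

    # Brute-force pairwise Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # After sorting each row, column 0 is the point itself (distance 0),
    # so column k is the distance to the k-th nearest neighbor.
    h = np.sort(dist, axis=1)[:, k]

    # Gaussian kernel evaluated at the scaled distances (assumed kernel).
    u = dist / h[:, None]
    kernel = np.exp(-0.5 * u ** 2) / (2.0 * np.pi) ** (d / 2.0)

    # Average the kernel contributions and normalize by the local volume h^d.
    return kernel.sum(axis=1) / (n * h ** d)


# Usage: two synthetic clusters with very different spreads.
rng = np.random.default_rng(0)
tight = rng.normal(loc=0.0, scale=0.1, size=(50, 2))
loose = rng.normal(loc=5.0, scale=1.0, size=(50, 2))
density = knn_kernel_density(np.vstack([tight, loose]), k=5)
```

Because the bandwidth adapts to the local neighbor distance, points in the tight cluster receive much higher density values than points in the loose one, which is the property the abstract relies on for handling clusters of different densities.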
Similar resources
High-Dimensional Unsupervised Active Learning Method
In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...
KNN Kernel Shift Clustering with Highly Effective Memory Usage
This paper presents a novel clustering algorithm with highly effective memory usage. The algorithm, called kNN kernel shift, classifies samples based on the underlying probability density function. In density-based clustering algorithms, a local mode of the density represents a cluster center. It is effective to shift each sample to a point having higher density, considering the density gradient...
RECOME: A new density-based clustering algorithm using relative KNN kernel density
Discovering clusters from a dataset with different shapes, density, and scales is a known challenging problem in data clustering. In this paper, we propose the RElative COre MErge (RECOME) clustering algorithm. The core of RECOME is a novel density measure, i.e., Relative K nearest Neighbor Kernel Density (RNKD). RECOME identifies core objects with unit RNKD, and partitions non-core objects int...
Projection Pursuit via Decomposition of Bias Terms of Kernel Density
Dimension reduction of data, R^d → R^p (p << d), to be used for clustering has specific requirements that are not generally met by generic dimension reduction algorithms such as principal components. Projection pursuit, on the other hand, has a growing variety of criteria that target holes, skewness, etc., using information measures, density functionals, sample moments, etc. With the exception o...
Performance Assessment of Kernel Density Clustering for Gene Expression Profile Data
Kernel density smoothing techniques have been used in classification or supervised learning of gene expression profile (GEP) data, but their applications to clustering or unsupervised learning of those data have not been explored and assessed. Here we report a kernel density clustering method for analysing GEP data and compare its performance with the three most widely-used clustering methods: ...
Journal:
- Computational Statistics & Data Analysis
Volume 51, Issue
Pages -
Publication date 2006